Team size matters: Collaboration and scientific impact since 1900

Authors

Vincent Larivière

Cassidy R. Sugimoto

Andrew Tsou

Yves Gingras

Abstract

This paper provides the first historical analysis of the relationship between collaboration and scientific impact, using three indicators of collaboration (number of authors, number of addresses, and number of countries) and including articles published between 1900 and 2011. The results demonstrate that an increase in the number of authors leads to an increase in impact, from the beginning of the last century onwards, and that this is not simply due to self-citations. A similar trend is also observed for the number of addresses and number of countries represented in the byline of an article. However, the constant inflation of collaboration since 1900 has resulted in diminishing citation returns: larger and more diverse (in terms of institutional and country affiliation) teams are necessary to realize higher impact. The paper concludes with a discussion of the potential causes of the impact gain in citations of collaborative papers.

Introduction

The notion of the lone genius is one of science’s keystone myths, simultaneously romantic and tidy. The quintessential example is that of Einstein conducting cutting-edge research while working as an examiner at the Bern patent office (Pyenson, 1985; Simonton, 2013). However, as with many myths, the “lone genius” legend is not entirely accurate. Scientific research has never been a strictly individual enterprise (Shapin, 1989), and even Einstein collaborated on several papers (Pyenson, 1985). In chemistry, for example, one third of published papers had more than one author in 1900, a proportion that grew to 70% by the end of World War II (Gingras, 2010). In contemporary science, hyperauthorship is rife (Cronin, 2005) and it is rare for a single scientist to be responsible for a major theoretical breakthrough (Wuchty, Jones, & Uzzi, 2007).
In the post-WWII era, big science and the large amounts of money required for research have fostered an environment that encourages research collaboration and the accordant marginalization of the solitary “genius” (Simonton, 2010; Wuchty, Jones, & Uzzi, 2007). Most research leading to Nobel Prizes is also the result of collaboration (Zuckerman, 1967), despite the anachronistic fact that Nobel Prizes cannot be attributed to more than three individuals. The demise of the single-authored paper in scholarly communication had long been predicted (Price, 1963), and in the hard sciences, there is evidence that “much of the cutting-edge work these days tends to emerge from large, well-funded collaborative teams involving many contributors” due to the increasing specialization witnessed in all research fields (Simonton, 2013, p. 602). In parallel with the rise in the number of authors, we have also observed a growth in the number of internationally co-authored papers (Larivière, Gingras, & Archambault, 2006; Sonnenwald, 2007), and many studies have shown a correlation between collaboration and impact at the micro, meso, and macro levels (Franceschet & Costantini, 2010; Narin, Stevens, & Whitlow, 1991). For example, Wuchty, Jones, and Uzzi (2007) found that while “solo authors did produce the papers of singular distinction...in the 1950s...the mantle of extraordinarily cited work has passed to teams by 2000” (p. 1038). Although a variety of studies have demonstrated a connection between scientific impact and the various types of collaboration, no study has yet looked at these relationships from a historical standpoint. The aim of this paper is to perform an analysis of the relationship between collaboration and impact using a dataset of 32.5 million papers and 515 million citations received over the 1900-2011 period.
With the aid of this historical dataset, we provide empirical data on the evolution of the various types of collaboration since 1900 and perform the first historical analysis of the effect of these various forms of collaboration on citation rates, assessing as well the role of self-citations. The objective is to understand, quantitatively, whether the relationship between collaboration and impact has been static across the century. Such stability would suggest a structural relationship between these two variables, one that was unaffected by the rise of citation indices or the fervor of research assessment exercises in the late 20th century. Furthermore, analysis of the relationship between impact and collaboration is necessary to make evidence-based decisions about the allocation of funding and other resources for team science. Following standard practice, we use co-authorship (that is, the presence of more than one author on the byline of a scientific publication) as our operationalization of the concept of scientific collaboration. However, the specific terminology has been disputed. Laudel (2002) argued that using co-authorship as a proxy for collaboration is based on the faulty assumption that all co-authors are also collaborators and, conversely, that all those who collaborated were listed as co-authors. Katz and Martin (1997) suggested that “[w]hat constitutes a collaboration therefore varies across institutions, fields, sectors and countries, and very probably changes over time as well” (p. 16). However, despite these caveats, “co-authorship in publications is widely considered as a reliable proxy for scientific collaboration” (Franceschet & Costantini, 2010, p. 541) and will be employed here.

Background

The trend of increasing co-authorship is not a new one, nor is the study of the role of collaboration in science (e.g., Hagstrom, 1965). As early as 1963, Price predicted that scholarly publications will “move steadily toward an infinity of authors per paper” (p.
89). Recent studies have confirmed that co-authorship is becoming increasingly common across all disciplines (Cronin, Shaw, & Barre, 2003; Franceschet & Costantini, 2010; Galison, 2003; Larivière, Gingras, & Archambault, 2006; Persson, Glänzel, & Danell, 2004; Wuchty, Jones, & Uzzi, 2007). There is also an upward trend in the number of authors credited on a paper, which sometimes reaches triple digits (Abramo, D’Angelo, & Di Costa, 2009).

The collaboration advantage

Several scholars have noted that collaboration is well-suited to the increasingly narrow focus of scientific research (Franceschet & Costantini, 2010; Simonton, 2013), although it has been argued that increasing specialization cannot fully account for the growth in collaboration (de B. Beaver & Rosen, 1978). Other scholars have postulated that collaboration is the only practical solution when one considers the shortage of necessary resources (Wray, 2002), and it has been suggested that “easier access to public financing; aspirations for greater prestige and visibility resulting from collaboration with renowned research groups; and opportunities to attain higher productivity” are other factors that encourage collaboration (Abramo, D’Angelo, & Di Costa, 2009, p. 156). Collaboration is, however, not without its complications. For example, it has been suggested that there may be long-term negative effects when nations engage excessively in collaborations, rather than constructing their own research capabilities (Wagner & Leydesdorff, 2005). Price and de B. Beaver (1966) somewhat humorously suggested that the social function of collaboration is to provide “a method for squeezing papers out of the rather large population of people who have less than a whole paper in them” (p. 1015). Essentially, collaboration allows people with different (and ideally complementary) skills to come together in order to solve a single problem (Franceschet & Costantini, 2010).
This integration, of course, is not always seamless and can cause friction, wasted time, and possible threats to the quality of the work when understanding is not reached by all participants (Franceschet & Costantini, 2010). Collaboration also relies on a healthy balance of trust and bureaucracy (Shrum, Genuth, & Chompalov, 2007), the “micropolitics of collaboration” (Atkinson, Batchelor, & Parsons, 1998, p. 260) that must be negotiated for productive collaboration. Furthermore, collaboration complicates notions of contribution and responsibility in publication (Birnholtz, 2006; Kennedy, 2003). Collaboration has been positively correlated with many metrics of academic quality (see Sugimoto, 2011 for a review). For example, collaboration has been shown to lead to higher productivity (Abramo, D’Angelo, & Di Costa, 2009; Bordons, Gomez, Fernandez, Zulueta, & Mendez, 1996; Landry, Traore, & Godin, 1996; Mairesse & Turner, 2005), with productivity increasing as team size increases (Adams, Black, Clemmons, Paula, & Stephan, 2005). The citation advantage of multi-authored papers is another incentive for scholars: many studies have demonstrated that co-authored papers tend to have higher citation impact than single-authored papers (e.g., Wuchty, Jones, & Uzzi, 2007). Similarly, collaborations between industries and universities (Lebeau, Laframboise, Larivière, & Gingras, 2008) and international collaborations (Franceschet & Costantini, 2010; Glänzel, 2001; Katz & Hicks, 1997) have also been shown to yield, on average, higher scientific impact.

The geography of collaboration

Despite the “falling cost and growing ease of communication” among scientists (Katz & Martin, 1997, p. 8), there is evidence of a “‘proximity effect,’ whereby collaboration intensity is inversely proportional to the distance between the players at stake” (Abramo, D’Angelo, & Di Costa, 2009, p. 156; see also Cronin, 2008; Gieryn, 2002; Katz & Martin, 1997; Sugimoto & Cronin, 2012; Yan & Sugimoto, 2011).
To incentivize scholars to collaborate across geographic boundaries, a number of institutional and governmental initiatives have been put into place (Abramo, D’Angelo, & Di Costa, 2009). Scholars have the potential to gain academic capital for engaging in collaboration, and a number of studies have demonstrated a citation advantage for articles co-authored across institutions and nations (see Gazni, Sugimoto, & Didegah [2012] for a review of this work). However, this advantage is not universal. Frame and Carpenter (1979) suggested that international collaboration is more likely to be witnessed in “basic” fields, and that “extra-scientific factors (for example, geography, politics, language) play a strong role in determining who collaborates with whom in the international scientific community” (p. 481). Data on the proportion of papers written in international collaboration also show that this proportion is lowest in fields that have more local values, such as social sciences, engineering, and clinical research, and highest in disciplines that are more universal in their objects, such as mathematics, physics, and space science (Gingras, 2002). Variation is also seen at the country level, where countries with weaker scientific infrastructure tend to engage more heavily in international collaboration (Luukkonen, 1992). Another (perhaps more intuitive) finding was that “the larger the national scientific enterprise, the smaller the proportion of international co-authorship” (Frame & Carpenter, 1979, p. 481). There is compelling evidence that geographic proximity between the first and last author generates higher citations (Lee, Brownstein, Mills, & Kohane, 2010) and that international collaborations, in general, generate higher citations (Glänzel, 2001).
Self-citations

Self-citations have been critically viewed as a gaming mechanism in scholarly communication (MacRoberts & MacRoberts, 1989), and several studies have examined the prevalence of self-citations at multiple levels of analysis, including papers, journals, individuals, and countries (Eto, 2003; Frandsen, 2007; Minasny, Hartemink, & McBratney, 2010; Snyder & Bonzi, 1998; Tagliacozzo, 1977). Early studies found that self-citations ranged from 8% at the individual level to 20% at the journal level (Garfield & Sher, 1963), while more recent studies have given percentages as high as 36% (Aksnes, 2003). In addition, some studies have demonstrated that the rate of self-citation fluctuates between disciplines (e.g., Bonzi & Snyder, 1990). Although complaints have been leveled at self-citation practices, Glänzel, Debackere, Thijs, and Schubert (2006) found that, at the macro-level, “there is no reason for condemning self-citations in general or for removing them from citation statistics” (p. 275). The citation advantage of co-authored works has been challenged on the grounds that it simply results from the “amplification” of the known practice of self-citation (van Raan, 1998): that is, if each author self-cites to the same degree that a single author would, the self-citations to a co-authored paper should be multiplied by the number of co-authors. It should be no surprise that self-citations increase with the number of co-authors (Wallace, Larivière, & Gingras, 2012), although it must be noted that the self-citation rate does not increase linearly (Glänzel & Thijs, 2004), suggesting that self-citation is not a sufficient explanation for the citation advantage of co-authored works. Instead, a specific citation impact seems to be associated with collaborative work (van Raan, 1998).
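The amplification argument above can be written out as a small worked expression. The symbols are introduced here purely for illustration and do not come from the cited studies: $n$ is the number of co-authors, $s$ is the number of self-citations a single author would contribute, and $S_{\text{amp}}(n)$ is the number of self-citations a paper would receive if amplification were the whole story.

```latex
% Amplification hypothesis: every co-author self-cites as a single author would
S_{\text{amp}}(n) = n \cdot s

% Illustrative case with s = 1 (one self-citation per author):
S_{\text{amp}}(1) = 1, \qquad S_{\text{amp}}(20) = 20
```

If observed self-citation counts do not follow this straight line in $n$, as the non-linear increase reported above indicates, then amplification alone cannot account for the citation advantage of co-authored papers.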
Just as collaboration practices vary by discipline (Larivière, Gingras, & Archambault, 2006), so too does the citation impact of collaboratively written works (Abramo, D’Angelo, & Di Costa, 2009; Pečlin, Južnič, Blagus, Sajko, & Stare, 2012). These studies often lack generalizability due to small sample sizes, disciplinary focus, and limited time periods for analysis. Additional large-scale research is needed to identify the extent to which a citation advantage prevails across time and discipline. In addition, despite evidence of differences in self-citation by subspecialties, age, and (to a lesser extent) gender (Hutson, 2006), there is still much research to be done on the interaction between self-citation and impact.

Methods

The data for this paper are drawn from Thomson Scientific’s Science Citation Index Expanded (SCIE), Social Sciences Citation Index (SSCI), and Arts and Humanities Citation Index (AHCI) for the 1900-2011 period. We analyze 28,160,453 papers (articles, notes and reviews) and 484,393,178 citations received in Natural and Medical Sciences (NMS) as well as 4,347,229 papers and 30,587,347 citations received in the Social Sciences and Humanities (SSH). The evolution of the relationship between scientific impact and three types of collaboration is also presented. These three types of collaboration are: 1) co-authorship (i.e., number of authors), 2) interinstitutional collaboration (i.e., number of addresses), and 3) international collaboration (i.e., number of countries). While co-authorship data are presented here from 1900 onwards, interinstitutional and international collaboration data are only available from 1973 onwards, as it was in that year that Thomson Reuters’ predecessor, the Institute for Scientific Information, began to index institutional addresses in a consistent manner. Citations are counted from publication year until the end of 2011.
In order to have a citation window of at least two years following the initial publication year, scientific impact data are presented up until 2009. To take into account different citation practices across subfields of science and publication years, all citation data were normalized according to the average number of citations received by the papers that were published in the same year and in the same specialty (average of relative citations, ARC). Accordingly, we have ensured that the “collaboration” variable is isolated and that the greater impact of collaborative research is not due to disciplines with higher collaboration rates having greater citation traffic. In order to assess whether self-citations played a role in the (greater) impact of collaborative research, two types of field-normalized citation impact were compiled, one including self-citations and the other excluding self-citations. In the latter case, all authors’ self-citations, irrespective of their position in the author list, were excluded from each paper in the numerator as well as from the denominator itself (i.e., the average number of citations of all papers in the respective specialty that were published in the same year). We did not remove self-citations at the institutional or country level when analyzing the effect of the evolution of the number of addresses or countries on scientific impact, as it is individuals who cite, not institutions or countries. We also compiled data (not shown) on the top 5% most cited papers for each of the specialties; the trends were identical to those shown in the figures.

Results

Evolution of collaboration

For all types of collaboration analysed, the numbers of authors, addresses, and countries were grouped into classes. Figure 1 presents the yearly evolution of the percentage of papers in each of the classes of number of authors, number of institutional addresses, and number of countries.
It shows that, in all collaboration types, single author/address/country papers are decreasing, both in NMS and SSH, a finding that has been shown in other studies (Larivière, Gingras, & Archambault, 2006; Wuchty, Jones, & Uzzi, 2007). More specifically, papers with one author accounted, in 1900, for 87% and 97% of all papers in NMS and SSH respectively; these percentages are, in 2011, 7% and 38% respectively. At the beginning of the period, the decrease of the proportion of single-author papers in NMS is due to the increase of papers with two authors; the proportion of the latter has also decreased since the beginning of the 1960s, mainly due to the increase of papers with more than two authors. The same phenomenon is observed for papers with 3 authors at the beginning of the 1980s. In 2011, paper classes with 4-5 and 6-10 authors account for about 29% and 27% of all NMS papers, respectively. Classes with 11-20 authors increased their proportion of papers by more than 2000% (0.22% to 4.5%) between 1980 and 2011, while papers with 21 authors or more increased by more than 1000% (0.04% to 0.5%) over the same period. In the SSH, all paper classes with more than one author increase, although, in 2011, the mode (i.e., the most common number of authors) is still one.

Footnote 1: In order to have robust trends in the graphs, only ARC scores based on at least 100 papers are shown.

Figure 1. Percentage of papers by classes of numbers of authors, addresses, and countries, for natural and medical sciences (NMS) and social sciences and humanities (SSH), 1900-2011

We also observe a decline of the proportion of papers with only one address, from 70% in 1973 to slightly more than 30% in 2011 in NMS and from 70% to 46% in SSH over the same period of time. In NMS, we observe a stabilization of the share of papers with two addresses while papers with three addresses or more are still increasing.
Given that SSH journals’ editorial policies sometimes do not specify the institutional address of authors, approximately 30% of papers from 1975 lacked institutional addresses. This percentage has decreased steadily, although such papers still account for 8%. In NMS, papers without any address account for only 1% of all papers, down from 8% in 1973. Some of these cases are actually indexing mistakes, which generally occur in lower-tier journals. Papers with authors from two countries also increased their proportion of all published literature, accounting for 18% of NMS and 14% of SSH literature in 2011. In NMS, papers with three, four to five, and six or more authors have increased by 2651%, 4969%, and 8365% between 1973 and 2011 while, in SSH, papers with three and four authors or more increased their share of all papers by more than 3000% and 4300%, respectively. Single-country papers are still the mode, accounting for 77% and 84% of all papers in NMS and SSH, respectively. These data show an increase of all types of collaboration, both in NMS and SSH. The following sections will assess how these different types of collaboration and their intensity have influenced papers’ scientific impact over the course of the last decades.

Inclusion vs. exclusion of self-citations

It is commonly thought that the larger impact of collaborative research is due, at least in part, to authors’ self-citations (Herbertz, 1995). If a paper has 20 authors, and each of the authors cites it at least once, then it accumulates 20 self-citations. At the other end of the spectrum, a paper with only one author, who cites the paper in a following publication, will result in only one self-citation. Data presented in Figure 2 show the gain in scientific impact (field-normalized citation rates) obtained by including self-citations as a function of the number of authors.
More specifically, Figure 2 shows that, in both areas, non-collaborative research (i.e., single-author papers) has a lower citation impact when self-citations are included. In SSH, papers with two authors obtain similar normalized citation rates whether or not self-citations are included, whereas papers with at least three authors enjoy a steady increase in impact, gained from including self-citations. This gain reaches 20% at 25 authors, and oscillates around that percentage until 40 authors appear on the paper. In NMS, it takes many more authors in order to benefit from self-citations. Papers with fewer than four authors obtain, on average, lower field-normalized scores when self-citations are included than when they are excluded, which is due to the fact that the self-citations are excluded from both the numerator and the denominator. It is only when a paper has at least five authors that ARC scores including self-citations rise above those without self-citations. The gain in impact from self-citations is much smaller in NMS than in SSH for the same number of authors, which is likely a consequence of the lower number of citations in the latter disciplines. These results are consistent with those obtained by Aksnes (2003) in an analysis of Norwegian papers. In order to remove this self-citation effect, even though it is small, the data presented in the rest of the paper exclude all authors’ self-citations.

Figure 2. Gain in average of relative citations (ARC) when self-citations are included, as a function of the number of authors in NMS and SSH, 2005-2009. Three-year moving averages.

The relationship between scientific impact and the number of authors, addresses, and countries

Figure 3 presents the field-normalized impact of papers for both NMS and SSH, excluding self-citations, as a function of the number of authors, addresses, and countries.
In NMS, the impact score of papers increases steadily with the number of authors until it reaches about 45 authors, where it flattens and oscillates, although it remains well above average. The same phenomenon is also observed in the social sciences. Given the lower proportion of collaborative research papers in those disciplines, as well as the lower number of papers involved, the impact score tends to oscillate at the level of 20 authors, although, again, it remains above average. Compared to SSH, many more authors are required in NMS in order to realize a given percentage of citation gain from collaboration. SSH papers start gaining citations at three authors, with a roughly linear gain until 20 authors, whereas NMS papers gain citations starting with 5-author papers, with no additional gain in the 10-20 author range. Similar trends are observed when considering the numbers of addresses and countries: the larger the number of addresses and countries appearing on a paper, the larger the impact. Unsurprisingly, papers with no address obtain the lowest impact.
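The field normalization described in the Methods section (average of relative citations, ARC) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' actual pipeline: the record fields (`specialty`, `year`, `n_authors`, `citations`, `self_citations`), the function names, and the four sample records are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def relative_citations(papers, exclude_self=False):
    """Return one field-normalized citation score per paper, in input order."""

    def count(p):
        # When excluding self-citations, drop them from the paper's own count.
        if exclude_self:
            return p["citations"] - p["self_citations"]
        return p["citations"]

    # Denominator: mean citations of all papers in the same specialty and
    # year, computed with the same counting rule as the numerator.
    groups = defaultdict(list)
    for p in papers:
        groups[(p["specialty"], p["year"])].append(count(p))
    baseline = {key: mean(values) for key, values in groups.items()}

    scores = []
    for p in papers:
        ref = baseline[(p["specialty"], p["year"])]
        scores.append(count(p) / ref if ref > 0 else float("nan"))
    return scores


def arc_by_authors(papers, exclude_self=False):
    """Average the relative citation scores for each number of authors."""
    scores = relative_citations(papers, exclude_self)
    by_n = defaultdict(list)
    for p, s in zip(papers, scores):
        by_n[p["n_authors"]].append(s)
    return {n: mean(values) for n, values in by_n.items()}


# Hypothetical records, invented purely to exercise the functions.
sample = [
    {"specialty": "physics", "year": 2005, "n_authors": 1,
     "citations": 4, "self_citations": 1},
    {"specialty": "physics", "year": 2005, "n_authors": 5,
     "citations": 12, "self_citations": 4},
    {"specialty": "sociology", "year": 2005, "n_authors": 1,
     "citations": 2, "self_citations": 0},
    {"specialty": "sociology", "year": 2005, "n_authors": 3,
     "citations": 6, "self_citations": 2},
]

arc_incl = arc_by_authors(sample, exclude_self=False)
arc_excl = arc_by_authors(sample, exclude_self=True)
```

Because self-citations are removed from the denominator as well as the numerator, a paper with few self-citations can score higher once they are excluded. In this toy sample the single-author class behaves that way, which mirrors the direction of the effect reported for single-author papers in Figure 2, although the numbers themselves carry no empirical meaning.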

Journal:

JASIST

Volume 66, Issue

Pages -

Publication date: 2015